A flexible framework for deriving assertions from electronic medical records

Authors

  • Kirk Roberts
  • Sanda M. Harabagiu
Abstract

OBJECTIVE This paper describes natural-language-processing techniques for two tasks: identification of medical concepts in clinical text, and classification of assertions, which indicate the existence, absence, or uncertainty of a medical problem. Because so many resources are available for processing clinical texts, there is interest in developing a framework in which features derived from these resources can be optimally selected for the two tasks of interest.

MATERIALS AND METHODS The authors used two machine-learning (ML) classifiers: support vector machines (SVMs) and conditional random fields (CRFs). Because SVMs and CRFs can operate on a large set of features extracted from both clinical texts and external resources, the authors address the following research question: Which features need to be selected for obtaining optimal results? To this end, the authors devise feature-selection techniques which greatly reduce the amount of manual experimentation and improve performance.

RESULTS The authors evaluated their approaches on the 2010 i2b2/VA challenge data. Concept extraction achieves 79.59 micro F-measure. Assertion classification achieves 93.94 micro F-measure.

DISCUSSION Approaching medical concept extraction and assertion classification through ML-based techniques has the advantage of easily adapting to new data sets and new medical informatics tasks. However, ML-based techniques perform best when optimal features are selected. By devising promising feature-selection techniques, the authors obtain results that outperform the current state of the art.

CONCLUSION This paper presents two ML-based approaches for processing language in the clinical texts evaluated in the 2010 i2b2/VA challenge. By using novel feature-selection methods, the techniques presented in this paper are unique among the i2b2 participants.
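The results above are reported as micro F-measure. As a rough illustration of how such a score is typically computed, the sketch below pools true positives, false positives, and false negatives across all classes before computing precision and recall. The function name `micro_f_measure`, the `(item_id, label)` representation, and the toy labels are assumptions made for this example; they are not taken from the paper or from the i2b2/VA evaluation scripts.

```python
def micro_f_measure(gold, predicted):
    """Micro-averaged F-measure over sets of (item_id, label) pairs.

    Counts are pooled over all labels, so frequent classes weigh more
    than rare ones. The pair representation is an assumption of this
    sketch, not the official challenge format.
    """
    true_pos = len(gold & predicted)    # pairs found with the right label
    false_pos = len(predicted - gold)   # predicted pairs not in the gold set
    false_neg = len(gold - predicted)   # gold pairs the system missed

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): six medical problems with assertion labels.
gold = {(1, "present"), (2, "present"), (3, "present"),
        (4, "absent"), (5, "absent"), (6, "possible")}
predicted = {(1, "present"), (2, "present"), (3, "possible"),
             (4, "absent"), (5, "absent"), (6, "present")}

score = micro_f_measure(gold, predicted)
print(f"micro F-measure: {100 * score:.2f}")
```

For concept extraction the same function could be applied to pairs whose first element is a text span rather than an integer id, under the assumption that a concept counts as correct only when both span and type match.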

Similar resources

Model Formulation: Modeling Electronic Discharge Summaries as a Simple Temporal Constraint Satisfaction Problem

OBJECTIVE To model the temporal information contained in medical narrative reports as a simple temporal constraint satisfaction problem. DESIGN A constraint satisfaction problem is defined by time points and constraints (inequalities between points). A time interval comprises a pair of points and a constraint. Five complete electronic discharge summaries and paragraphs from 226 other discharg...

Automated identification of medical concepts and assertions in medical text.

This paper describes a machine learning, text processing approach that allows the extraction of key medical information from unstructured text in Electronic Medical Records. The approach utilizes a novel text representation that shares the simplicity of the widely used bag-of-words representation, but can also represent some form of semantic information in the text. The large dimensionality of ...

Developing a Standardized Medical Speech Recognition Database for Reconstructive Hand Surgery

Fast and holistic access to the patients’ clinical record is a major requirement of modern medical decision support systems (DSS). While electronic health records (EHRs) have replaced the traditional paper-based records in most healthcare organizations, the data entry into these systems remains largely manual. Speech recognition technology promises substitution of the more convenient speech-base...

How to Standardize Electronic Medical Records

Introduction: One of the key elements of success of health institutions is Standardization. This study introduces the methods and stages of electronic medical records standardization. Methods: The present study is a narrative review of the studies on the stages and methods of electronic medical records standardization. Results: The process of standardization of electronic medical records incl...

A Framework for Assessing Adherence and Persistence to Long-Term Medication

Poor adherence and persistence to long-term medication is a growing concern worldwide. Despite their importance, tools that facilitate the identification of patients who show poor adherence and persistence rates are limited. Herein we present a framework we have developed to assist in assessing adherence and persistence rates. We demonstrate the framework's features using production electronic ...

Journal:
  • Journal of the American Medical Informatics Association : JAMIA

Volume 18, Issue 5

Pages: -

Publication date: 2011